ABSTRACT

The methods proposed by Karst and Sharpe for fitting a line to a given
set of points in the plane so as to minimize the sum of the absolute values of
the deviations are examined by means of their linear programming formulations.

It is seen that the Karst algorithm, which was criticized by Sharpe, already
contains the type of improvement proposed by Barrodale and Roberts for $\ell_1$
linear approximations. For the single purpose of finding the optimal line the
Karst procedure appears to be very efficient. If, as Sharpe suggests may
sometimes be the case, the investigator is interested in the sensitivity of the
minimum sum to changes in the slope parameter, then Sharpe's algorithm is
preferred. The Karst algorithm is improved by incorporating into it the simplex
stopping rule. The problem is generalized to permit arbitrary weightings of
the deviations.
A DEFENSE OF THE KARST ALGORITHM FOR FINDING THE
LINE OF BEST FIT UNDER THE $\ell_1$ NORM
Donald R. Schuette

1.  Introduction 

Curve fitting so as to minimize the sum of the absolute deviations, i.e. under the
$\ell_1$-norm, has received the attention of a number of writers in recent years (e.g. [1], [3],
[7], [8]). In particular, Karst [4] and Sharpe [9] have proposed algorithms for fitting a
line to a given set of points in the plane under that criterion. Rao and Srinivasan [5] have
interpreted Sharpe's method as the solution to the parametric dual to the linear programming
formulation of the problem and have offered an alternate procedure parameterizing on the
intercept rather than on the slope as in the case of Sharpe.

In this paper the linear programming formulations of Karst's and Sharpe's algorithms will
be examined and compared. It will be seen that one iteration of the Karst algorithm is
equivalent to several simplex iterations in the same manner as in the "improved algorithm"
of Barrodale and Roberts. The Karst algorithm will be improved by incorporating into it the
simplex stopping rule. The problem will be generalized to allow for assignment of arbitrary
weights to the deviations at each point.


2.  Geometric  Considerations 


Given a set of points $(X_i, Y_i)$ and non-negative weights $w_i$ for $i = 1, 2, \ldots, N$,
the problem is to find values of the parameters a and b so as to minimize

$$S(a,b) = \sum_{i=1}^{N} w_i \, |Y_i - (a + bX_i)| .$$
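As a small illustration of the objective, the following sketch evaluates the weighted sum of absolute deviations for a candidate line. The function name `weighted_l1_sum` and the list-of-pairs data layout are assumptions made here for illustration; the data are those of Example 1 in Section 6.

```python
def weighted_l1_sum(points, weights, a, b):
    """Return S(a, b) = sum_i w_i * |Y_i - (a + b * X_i)|."""
    return sum(w * abs(y - (a + b * x)) for (x, y), w in zip(points, weights))


# Data of Example 1 (Section 6): points (X_i, Y_i) and weights w_i.
points = [(1, 1), (2, 5), (3, 10), (4, 22), (5, 18)]
weights = [1, 2, 6, 3, 2]

# The line reported as optimal in Example 1: a = -3.5, b = 4.5.
print(weighted_l1_sum(points, weights, -3.5, 4.5))  # 25.5
```

The printed value agrees with the minimum sum S(a,b) = 25.5 reported for that example.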

In addition to the (X, Y) plane in which the given points lie, it is convenient to
consider, as did Karst and Sharpe, the (a, b) plane or parameter space for the problem. The
surface S(a,b) lying above the (a,b) plane is piecewise linear and convex, as noted by
Sharpe and Karst (see also Rice [6]). Each point in the (a, b) plane corresponds to a line
in the (X, Y) plane. The line a = r - sb in the (a,b) plane corresponds to the family
of lines in the (X, Y) plane through the point with coordinates (s,r), and the line b = b'
in the (a, b) plane corresponds to the family of lines in the (X, Y) plane with constant
slope. Hence, each of the given points $(X_i, Y_i)$ determines a line $a = Y_i - X_i b$ in the
(a,b) plane. Such lines will be referred to as basic lines (see Figure 1).

One of the well known results of linear approximation theory (Barrodale and
Roberts [3], Rice [6]) is that there is an optimal solution which interpolates at least two
of the data points. That means that there is an optimal point in the (a,b) plane which
lies at the intersection of two basic lines. When there is only one optimal point it must
lie at the intersection of two basic lines. Hence the justification for solution search
procedures which restrict the search to the points of intersection of basic lines.
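The interpolation result can be illustrated by brute force: examine every line through two data points and keep the best. This is only a sketch for checking small problems (its cost grows as N cubed), not one of the algorithms discussed in this paper; the name `best_line_by_enumeration` is hypothetical and the data are those of Example 1.

```python
from itertools import combinations


def best_line_by_enumeration(points, weights):
    """Return (S, a, b) for the best line through two data points."""
    best = None
    for (xj, yj), (xk, yk) in combinations(points, 2):
        if xj == xk:
            continue  # a vertical line cannot be written as Y = a + bX
        b = (yk - yj) / (xk - xj)
        a = yj - b * xj
        s = sum(w * abs(y - (a + b * x)) for (x, y), w in zip(points, weights))
        if best is None or s < best[0]:
            best = (s, a, b)
    return best


points = [(1, 1), (2, 5), (3, 10), (4, 22), (5, 18)]
weights = [1, 2, 6, 3, 2]
print(best_line_by_enumeration(points, weights))
```

Each candidate pair corresponds to one intersection of two basic lines in the (a,b) plane, so the loop visits exactly the points to which the search may be restricted.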


3.  Linear  Programming  Formulations 


It is well known that the problem of choosing a and b so as to minimize

$$S(a,b) = \sum_{i=1}^{N} w_i \, |Y_i - (a + bX_i)| \qquad (1)$$

can be formulated as a linear programming problem (Wagner [12], Barrodale and Roberts [3],
Armstrong and Frome [1]). Armstrong and Frome, for example, in the case when $w_i = 1$
for all i, proceed by setting

$$P_i - N_i = Y_i - (a + bX_i)$$

with $P_i \ge 0$ and $N_i \ge 0$ for i = 1, 2, ..., N. Thus, the vertical deviation of each point
from the line with parameters a and b is divided into positive and negative components.
The linear programming problem then becomes

$$\text{minimize} \quad \sum_{i=1}^{N} (P_i + N_i)$$

subject to

$$P_i - N_i + a + bX_i = Y_i$$

$$P_i \ge 0, \quad N_i \ge 0 \quad \text{for } i = 1, 2, \ldots, N,$$

with a and b unrestricted in sign.

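The dual problem (see for example Wagner [12]) is

$$\text{maximize} \quad \sum_{i=1}^{N} Y_i v_i$$

subject to

$$\sum_{i=1}^{N} v_i = 0$$

$$\sum_{i=1}^{N} X_i v_i = 0$$

$$-1 \le v_i \le 1, \quad i = 1, 2, \ldots, N .$$

The dual is a bounded variable linear programming problem for which special solution methods
are available (see Taha [11]). If one prefers, the substitution $t_i = v_i + 1$ may be made so
that the problem becomes

$$\text{maximize} \quad \sum_{i=1}^{N} Y_i t_i - \sum_{i=1}^{N} Y_i \qquad (8)$$

subject to

$$\sum_{i=1}^{N} t_i = N \qquad (9)$$

$$\sum_{i=1}^{N} X_i t_i = \sum_{i=1}^{N} X_i \qquad (10)$$

$$0 \le t_i \le 2, \quad i = 1, 2, \ldots, N . \qquad (11)$$

The variables are then non-negative and upper bounded.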
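As a brief check of the change of variable, and assuming the substitution is $t_i = v_i + 1$ (the reading that makes the bounds $0 \le t_i \le 2$ come out), each part of the dual transforms as follows:

```latex
\begin{align*}
\sum_{i=1}^{N} Y_i v_i &= \sum_{i=1}^{N} Y_i (t_i - 1)
                        = \sum_{i=1}^{N} Y_i t_i - \sum_{i=1}^{N} Y_i, \\
\sum_{i=1}^{N} v_i = 0 &\iff \sum_{i=1}^{N} t_i = N, \\
\sum_{i=1}^{N} X_i v_i = 0 &\iff \sum_{i=1}^{N} X_i t_i = \sum_{i=1}^{N} X_i, \\
-1 \le v_i \le 1 &\iff 0 \le t_i \le 2 .
\end{align*}
```

The weighted case treated next works the same way with $w_i$ in place of 1.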

In the more general case where the deviations are weighted by the factors $w_i$, the
primal problem is

$$\text{minimize} \quad \sum_{i=1}^{N} w_i (P_i + N_i) \qquad (12)$$

subject to

$$P_i - N_i + a + bX_i = Y_i \qquad (13)$$

$$P_i \ge 0, \quad N_i \ge 0, \quad i = 1, 2, \ldots, N \qquad (14)$$

with a and b unrestricted in sign. The dual then becomes

$$\text{maximize} \quad \sum_{i=1}^{N} Y_i v_i \qquad (15)$$

subject to

$$\sum_{i=1}^{N} v_i = 0 \qquad (16)$$

$$\sum_{i=1}^{N} X_i v_i = 0 \qquad (17)$$

$$-w_i \le v_i \le w_i, \quad i = 1, 2, \ldots, N . \qquad (18)$$

Alternatively, by means of the substitution $t_i = v_i + w_i$ the dual problem can be
written as

$$\text{maximize} \quad \sum_{i=1}^{N} Y_i t_i - \sum_{i=1}^{N} Y_i w_i \qquad (19)$$

subject to

$$\sum_{i=1}^{N} t_i = \sum_{i=1}^{N} w_i \qquad (20)$$

$$\sum_{i=1}^{N} X_i t_i = \sum_{i=1}^{N} X_i w_i \qquad (21)$$

$$0 \le t_i \le 2w_i, \quad i = 1, 2, \ldots, N . \qquad (22)$$

As before the dual problems (15)-(18) and (19)-(22) are bounded variable problems. In
the case of problem (19)-(22) the variables are non-negative and upper bounded. It will be
appropriate to examine the nature of optimal solutions to these problems as provided by
the theory applicable to them (see Simonnard [10], Chapter 10 or Taha [11], Chapter 6).

First to be considered will be what is called an extremal solution. An extremal solution
in the case of problem (15)-(18) is one in which two variables, say $v_j$ and $v_k$, are
selected to be basic and the other N - 2 variables are to be non-basic and set equal
to one or the other of their extremal feasible values. Associated with the basic variables
are the basic matrix

$$B = \begin{bmatrix} 1 & 1 \\ X_j & X_k \end{bmatrix}$$

and the vectors $Y_B = [Y_j, Y_k]$ and $v_B = \text{column}\,[v_j, v_k]$. Index sets $I_U$ and $I_L$
may be defined such that $v_i = w_i$ if $i \in I_U$ and $v_i = -w_i$ if $i \in I_L$. Equations (16)
and (17) then lead to the equations

... passes
through the points $(X_j, Y_j)$ and $(X_k, Y_k)$. Condition (26) is that $(X_i, Y_i)$ lies above
that line and condition (27) is that $(X_i, Y_i)$ lies below that line. A method of generating
extremal solutions which satisfy the optimality conditions of problem (15)-(18) is thus
apparent: select a pair of points, $(X_j, Y_j)$ and $(X_k, Y_k)$, from the given set in the (X, Y)
plane and let $v_j$ and $v_k$ be the basic variables; for the points which lie above
the line set the corresponding dual variables $v_i = w_i$; for the remaining points lying on or
below the line set the corresponding dual variables $v_i = -w_i$. The values of $v_j$ and $v_k$
are then determined from equation (25).

Of course the dual problem optimality conditions are nothing other than the feasibility
conditions of the primal. An extremal dual solution which satisfies the optimality conditions
will be called primal feasible. An extremal dual solution which is primal feasible and dual
feasible, i.e. satisfies constraint (18), is an optimal solution.
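The construction of a primal feasible extremal solution, followed by the dual feasibility test (18), can be sketched as below. The sketch solves constraints (16) and (17) directly for the two basic variables instead of quoting equation (25); the function name `extremal_dual_solution` and the zero-based indices are assumptions of this illustration, and the data are those of Example 1.

```python
def extremal_dual_solution(points, weights, j, k):
    """Build the extremal dual solution with basic variables v_j and v_k.

    Returns (a, b, v, dual_feasible) where (a, b) is the line through
    points j and k (zero-based indices, X_j != X_k assumed).
    """
    (xj, yj), (xk, yk) = points[j], points[k]
    b = (yk - yj) / (xk - xj)
    a = yj - b * xj
    v = [0.0] * len(points)
    sum_v = 0.0   # sum of the non-basic v_i
    sum_xv = 0.0  # sum of the non-basic X_i * v_i
    for i, ((x, y), w) in enumerate(zip(points, weights)):
        if i in (j, k):
            continue
        # Above the line: v_i = w_i; on or below the line: v_i = -w_i.
        v[i] = w if y > a + b * x else -w
        sum_v += v[i]
        sum_xv += x * v[i]
    # Solve v_j + v_k = -sum_v and X_j v_j + X_k v_k = -sum_xv.
    v[k] = (xj * sum_v - sum_xv) / (xk - xj)
    v[j] = -sum_v - v[k]
    dual_feasible = abs(v[j]) <= weights[j] and abs(v[k]) <= weights[k]
    return a, b, v, dual_feasible


points = [(1, 1), (2, 5), (3, 10), (4, 22), (5, 18)]
weights = [1, 2, 6, 3, 2]
# Points 3 and 1 of Example 1 (indices 2 and 0) give the optimal line.
print(extremal_dual_solution(points, weights, 2, 0))
# Points 3 and 5 (indices 2 and 4) give a primal feasible but not optimal pair.
print(extremal_dual_solution(points, weights, 2, 4))
```

For the first pair both basic values fall inside their bounds, so the extremal solution is optimal; for the second pair one basic value violates (18).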


4.  Sharpe's  Algorithm 


Sharpe proceeds by parameterizing on b. With b held constant at b = b' the
problem is then to minimize

$$S(a,b') = \sum_{i=1}^{N} w_i \, |Y_i - b'X_i - a| .$$

Let $Y_i' = Y_i - b'X_i$ so that

$$S(a,b') = \sum_{i=1}^{N} w_i \, |Y_i' - a| .$$

With $P_i - N_i = Y_i' - a$ the primal linear programming problem is

$$\text{minimize} \quad \sum_{i=1}^{N} w_i (P_i + N_i)$$

subject to

$$P_i - N_i + a = Y_i'$$

$$P_i \ge 0, \quad N_i \ge 0, \quad i = 1, 2, \ldots, N .$$

The dual problem is

$$\text{maximize} \quad \sum_{i=1}^{N} Y_i' v_i$$

subject to

$$\sum_{i=1}^{N} v_i = 0$$

$$-w_i \le v_i \le w_i, \quad i = 1, 2, \ldots, N ,$$

or alternatively with $t_i = v_i + w_i$

$$\text{maximize} \quad \sum_{i=1}^{N} Y_i' t_i - \sum_{i=1}^{N} Y_i' w_i \qquad (32)$$

subject to

$$\sum_{i=1}^{N} t_i = \sum_{i=1}^{N} w_i \qquad (33)$$

$$0 \le t_i \le 2w_i, \quad i = 1, 2, \ldots, N . \qquad (34)$$

As Rao and Srinivasan [5] have noted, the latter is a knapsack problem without integer
restriction and has as its solution $a = Y_j'$ where $Y_j'$ is the median among the values of
$Y_i'$ relative to the weights $w_i$. From the point of view of bounded variable linear programming
theory that means: starting with the largest value of $Y_i'$, set the corresponding dual variable
$t_i = 2w_i$ and assign i to the index set $I_U$. Continue similarly with successively smaller
values of $Y_i'$ as long as $2A_U < \sum_{i=1}^{N} w_i = W$, where $A_U$ denotes the sum of the weights
of the indices in $I_U$ (and $A_L$ the corresponding sum over $I_L$). An index j will eventually
be reached such that $2(A_U + w_j) \ge W$. Then set $t_j = W - 2A_U$. Clearly the constraint
$0 \le t_j \le 2w_j$ is then satisfied. For any remaining index i set $t_i = 0$ and assign i to $I_L$. With
$A_U + w_j + A_L = W$, it follows that

$$t_j = A_L - A_U + w_j \qquad (35)$$

and that the constraint $0 \le t_j \le 2w_j$ is equivalent to the two inequalities

$$A_U \le A_L + w_j \qquad (36)$$

$$A_U \ge A_L - w_j \qquad (37)$$

from which follows the interpretation of $Y_j'$ as the median among the values of $Y_i'$ relative
to the weights $w_i$.
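The median rule just described can be sketched as follows. The function name `weighted_median_intercept`, the zero-based indices, and the use of `>=` in the stopping test (so that ties stop at the first qualifying index) are choices made for this illustration; the data are those of Example 1 with b = 0.

```python
def weighted_median_intercept(points, weights, b):
    """For a fixed slope b, return (j, a, upper, lower).

    a = Y_j - b * X_j is a weighted median of the reduced values
    Y'_i = Y_i - b * X_i; upper and lower are the index sets I_U and I_L.
    """
    total = sum(weights)                       # W
    reduced = [y - b * x for x, y in points]   # Y'_i
    # Visit the indices from the largest Y'_i downwards.
    order = sorted(range(len(points)), key=lambda i: reduced[i], reverse=True)
    a_upper = 0                                # A_U, weight already placed in I_U
    for pos, j in enumerate(order):
        if 2 * (a_upper + weights[j]) >= total:
            return j, reduced[j], order[:pos], order[pos + 1:]
        a_upper += weights[j]
    raise ValueError("at least one point is required")


points = [(1, 1), (2, 5), (3, 10), (4, 22), (5, 18)]
weights = [1, 2, 6, 3, 2]
print(weighted_median_intercept(points, weights, 0))
```

With b = 0 the call selects the third data point (index 2) with a = 10 and places points 4 and 5 in the upper set, which is the starting position of Sharpe's algorithm in Example 1.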

Thus, at optimum problem (32)-(34) has $t_j$ as the single basic variable. For indices
assigned to $I_U$ the corresponding dual variables are non-basic and set equal to their upper
bounds, and for indices assigned to $I_L$ the corresponding dual variables are non-basic
and set equal to their lower bounds. If $Y_j'$ is unique among the values of $Y_i'$, i.e.
$Y_i' \ne Y_j'$ for any other index i, then there is an interval of values of b about b' such
that $t_j$ continues as the basic variable and the optimal value of a lies on the basic line
$a = Y_j - bX_j$. Sharpe calls that line the border for those values of b. Let
$S_2(b) = \min_a \{S(a,b)\}$. Then over that interval $Y_i - bX_i > Y_j - bX_j$ for $i \in I_U$ and
$Y_i - bX_i < Y_j - bX_j$ for $i \in I_L$. Hence,

$$S_2(b) = S(Y_j - bX_j,\, b) = \sum_{i=1}^{N} w_i \, |Y_i - bX_i - (Y_j - bX_j)|$$

$$= \sum_{i \in I_U} w_i \,[\,Y_i - bX_i - (Y_j - bX_j)\,] - \sum_{i \in I_L} w_i \,[\,Y_i - bX_i - (Y_j - bX_j)\,]$$

$$= D_U - D_L - b(C_U - C_L) - (Y_j - bX_j)(A_U - A_L) \qquad (38)$$

where

$$D_U = \sum_{i \in I_U} w_i Y_i \quad \text{and} \quad D_L = \sum_{i \in I_L} w_i Y_i ,$$

and $C_U$ and $C_L$ are the corresponding sums of $w_i X_i$ over $I_U$ and $I_L$.
Thus, over the interval of values for which $t_j$ remains basic $S_2(b)$ is linear in b with
slope

$$S_2'(b) = C_L - C_U - X_j (A_L - A_U) . \qquad (39)$$

It may be noted that formula (39) also follows from the interpretation of the dual variables
at optimum as the rate of change of the objective function with respect to the right hand side
quantities of the associated constraints.
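Formulas (38) and (39) can be evaluated mechanically once the border index and the two index sets are known. The helper below is a sketch with a hypothetical name, `border_segment`, and zero-based indices; it returns the constant term and the slope of the linear piece of the minimum-sum function, using the data of Example 1.

```python
def border_segment(points, weights, j, upper, lower):
    """Return (constant, slope) of the linear piece given by formula (38).

    j is the border index; upper and lower are the index sets I_U and I_L.
    """
    a_u = sum(weights[i] for i in upper)
    a_l = sum(weights[i] for i in lower)
    c_u = sum(weights[i] * points[i][0] for i in upper)
    c_l = sum(weights[i] * points[i][0] for i in lower)
    d_u = sum(weights[i] * points[i][1] for i in upper)
    d_l = sum(weights[i] * points[i][1] for i in lower)
    xj, yj = points[j]
    constant = d_u - d_l - yj * (a_u - a_l)
    slope = c_l - c_u - xj * (a_l - a_u)   # formula (39)
    return constant, slope


points = [(1, 1), (2, 5), (3, 10), (4, 22), (5, 18)]
weights = [1, 2, 6, 3, 2]
print(border_segment(points, weights, 2, [3, 4], [0, 1]))  # (71, -11)
print(border_segment(points, weights, 2, [3], [0, 1, 4]))  # (39, -3)
print(border_segment(points, weights, 2, [0, 3], [1, 4]))  # (21, 1)
```

The three calls reproduce the pieces 71 - 11b, 39 - 3b and 21 + b that appear in the worked Example 1.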

Over its entire domain $S_2(b)$ is piecewise linear and convex with changes in slope
occurring whenever the border intersects another basic line. For a value of b corresponding
to such an intersection $S_2(b)$ is not differentiable and formula (39) does not hold. Sharpe
points out that if for b = b' the slope of $S_2(b)$ is negative, the optimal value of b is
greater than b', and vice versa if the slope is positive. The algorithm consists in following
the border to an intersection with another basic line $a = Y_k - bX_k$ at which the slope of
$S_2(b)$ changes sign. The coordinates of that point of intersection are the optimal values
of a and b. Each time the border reaches another basic line it must be determined
whether the border follows the new basic line or continues along the old line. That problem
will now be considered.


Suppose b' corresponds to the point of intersection of the two basic lines
$a = Y_j - bX_j$ and $a = Y_k - bX_k$, i.e. $Y_j - b'X_j = Y_k - b'X_k$. Consider an extremal solution
to problem (19)-(22) with $v_j$ and $v_k$ as basic variables and

$$A_U + A_L + w_j + w_k = W . \qquad (40)$$

Two cases must be considered. First suppose $X_k > X_j$. Then for b < b', $(X_k, Y_k)$ lies
above the line through $(X_j, Y_j)$ with slope b and, hence, $k \in I_U$. In terms of the values
at b = b', conditions (36) and (37) for $a = Y_j - bX_j$ to be the border are

$$A_U + w_k \le A_L + w_j \qquad (41)$$

$$A_U + w_k \ge A_L - w_j , \text{ or} \qquad (42)$$

$$-(w_j + w_k) \le A_U - A_L \le w_j - w_k . \qquad (43)$$

For b > b', index k transfers from $I_U$ to $I_L$ and the conditions become

$$A_U \le A_L + w_k + w_j \quad \text{and} \quad A_U \ge A_L + w_k - w_j , \text{ or}$$

$$w_k - w_j \le A_U - A_L \le w_j + w_k . \qquad (44)$$

In order for $a = Y_j - bX_j$ to be the border on both sides of the intersection both (43) and (44)
must hold, or

$$-(w_j - w_k) \le A_U - A_L \le w_j - w_k . \qquad (45)$$

Next suppose $X_k < X_j$. Then for b < b', $(X_k, Y_k)$ lies below the line through
$(X_j, Y_j)$ with slope b and hence $k \in I_L$. In terms of values at b = b' the conditions
for $a = Y_j - bX_j$ to be the border line are

$$A_U \le A_L + w_k + w_j \quad \text{and} \quad A_U \ge A_L + w_k - w_j , \text{ or}$$

$$w_k - w_j \le A_U - A_L \le w_k + w_j . \qquad (46)$$

For b > b' the condition is

$$-(w_j + w_k) \le A_U - A_L \le w_j - w_k . \qquad (47)$$

One additional result is required in order to implement Sharpe's algorithm, namely
formula (39) expressed in terms of values at b = b', i.e. when equation (40) holds. For
b < b' and $X_k > X_j$, $(X_k, Y_k)$ lies above the line through $(X_j, Y_j)$ with slope b. Hence
along the border $a = Y_j - bX_j$,

$$S_2'(b) = C_L - C_U - X_j (A_L - A_U) - w_k (X_k - X_j) .$$

Formulas for the other combinations of conditions may be obtained in a similar manner.

The following table summarizes the results required to implement Sharpe's algorithm
at the point of intersection of the two basic lines $a = Y_j - bX_j$ and $a = Y_k - bX_k$ in the
(a, b) plane.

Conditions                                               Border Line        S_2'(b)

b < b',   -(w_j + w_k) <= A_U - A_L <= w_j - w_k         a = Y_j - bX_j     ...
          w_j - w_k <= A_U - A_L <= w_j + w_k            a = Y_k - bX_k     ...

b > b',   -(w_j + w_k) <= A_U - A_L <= w_k - w_j         a = Y_k - bX_k     ...
          w_k - w_j <= A_U - A_L <= w_j + w_k            a = Y_j - bX_j     ...


5.  Karst's  Algorithm 


Karst proceeds by selecting an arbitrary line in the (a,b) plane and following it until
the minimum of S(a, b) above it is found. The initially selected line may or may not be a
basic line but, as will be seen, that minimum does occur above the intersection of the
initially selected line with one of the basic lines. The algorithm continues by following the
just encountered basic line until the minimum above it is found, again at the intersection
with a basic line. The process continues until the basic line just found is not a new one
but one previously encountered. The last encountered point of intersection of basic lines
has coordinates which are optimal values of a and b.

In terms of the (X, Y) plane the algorithm is equivalent to selecting an arbitrary initial
point and solving the restricted problem of minimizing S(a, b) over the set of lines which
pass through that point. An efficient procedure for solving the restricted problem is the
key element in Karst's algorithm, because each iteration requires the solution of just such
a problem.

One linear programming formulation of the restricted problem may be obtained as a
special case of problems (15)-(18) or (19)-(22) merely by the elimination of one constraint.
The argument is as follows. Suppose that the initially selected point is one of the original
data points $(X_j, Y_j)$. (If it is not, include it as the (N + 1)st point with $w_{N+1} = 0$.)
Then the restricted problem is obtained as a special case merely by making $w_j$ very large
in the primal problem, because then an optimal solution must pass through $(X_j, Y_j)$. In the
dual problems when $w_j$ is very large the constraints $-w_j \le v_j \le w_j$ and $0 \le t_j \le 2w_j$
become non-restrictive and thus may be eliminated.

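Hence, an extremal solution of problem (15)-(18) which has $v_j$ and $v_k$ basic and
which is primal feasible is optimal for the restricted problem if

$$-w_k \le v_k \le w_k . \qquad (48)$$

However, from formula (25)

$$v_k = \frac{C_L - C_U - X_j (A_L - A_U)}{X_k - X_j} . \qquad (49)$$

If $X_k > X_j$, then (48) is equivalent to

$$-w_k (X_k - X_j) \le C_L - C_U - X_j (A_L - A_U) \qquad (50)$$

$$C_L - C_U - X_j (A_L - A_U) \le w_k (X_k - X_j) . \qquad (51)$$

With $x_i = X_i - X_j$, $c_U = \sum_{i \in I_U} w_i x_i$ and $c_L = \sum_{i \in I_L} w_i x_i$, formula (49) becomes

$$v_k = \frac{c_L - c_U}{x_k}$$

and inequalities (50) and (51) may be written

$$-w_k x_k \le c_L - c_U \le w_k x_k . \qquad (52)$$

If $X_k < X_j$, then (48) is equivalent to

$$w_k x_k \le c_L - c_U \le -w_k x_k . \qquad (53)$$
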

Thus, a solution to the Karst restricted problem may be obtained by inspecting the
extremal solutions to problem (15)-(18) which are primal feasible until one is found which
satisfies either condition (52) or condition (53). The difficulty is that of generating those
solutions in a convenient and systematic way. One approach to overcoming that difficulty
will now be developed. Given that $a = Y_j - bX_j$, and writing $y_i = Y_i - Y_j$,

$$Y_i - (a + bX_i) = Y_i - (Y_j - bX_j + bX_i) = y_i - b x_i .$$

The conditions under which $y_i > b x_i$ and index i is therefore assigned to $I_U$ are either
$b < y_i / x_i$ and $x_i > 0$, or $b > y_i / x_i$ and $x_i < 0$. Similarly the conditions under which $y_i < b x_i$
and index i is assigned to $I_L$ are either $b < y_i / x_i$ and $x_i < 0$, or $b > y_i / x_i$ and $x_i > 0$.
The various situations may be visualized in terms of a plot of the given points in the (x, y)
plane and a line with slope b passing through the origin in that plane.


Figure 2


Index 2 is assigned to $I_U$ because $b < y_2 / x_2$ and $x_2 > 0$; index 3 is assigned to $I_U$
because $b > y_3 / x_3$ and $x_3 < 0$. On the other hand index 1 is assigned to $I_L$ because
$b > y_1 / x_1$ and $x_1 > 0$, and index 4 is assigned to $I_L$ because $b < y_4 / x_4$ and $x_4 < 0$. The
foregoing leads to the following procedure for solving the Karst restricted problem:


1)  Compute $x_i$ and $y_i$ for i = 1, 2, ..., N, i ≠ j. Assign i to $I_U$ if $x_i > 0$ and
    i to $I_L$ if $x_i < 0$. (If the $X_i$ values are not all distinct, so that $x_i = 0$ is
    possible, assign i to $I_U$ when $x_i = 0$ and $y_i > 0$ and i to $I_L$ when $x_i = 0$
    and $y_i < 0$.)

2)  Compute $y_i / x_i$ except when $x_i = 0$ and rank in ascending order.

3)  For the smallest value of $y_i / x_i$, say $y_k / x_k$, if $x_k > 0$, delete k from $I_U$ and test
    for satisfaction of condition (52). If (52) is satisfied, then $b = y_k / x_k$ is optimal. If
    not, transfer k to $I_L$ and proceed to the next higher value of $y_i / x_i$. If $x_k < 0$, delete
    k from $I_L$ and test for satisfaction of condition (53). If (53) is satisfied, then
    $b = y_k / x_k$ is optimal. If not, transfer k to $I_U$ and proceed to the next higher value of $y_i / x_i$.

4)  Continue the process until an index k is found for which condition (52) or (53) is
    satisfied.
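The four steps can be sketched in code as follows. The sketch assumes that conditions (52) and (53) read $-w_k x_k \le c_L - c_U \le w_k x_k$ and $w_k x_k \le c_L - c_U \le -w_k x_k$ respectively; the function name `karst_restricted` and the zero-based indices are choices made for this illustration. Points with $x_i = 0$ contribute nothing to $c_U$ or $c_L$ and are therefore skipped.

```python
def karst_restricted(points, weights, j):
    """Best slope among lines through points[j]; returns (b, k)."""
    xj, yj = points[j]
    ranked = []      # (y_i / x_i, i, x_i, w_i) for points with x_i != 0
    c_upper = 0.0    # c_U: sum of w_i * x_i over I_U
    c_lower = 0.0    # c_L: sum of w_i * x_i over I_L
    for i, ((X, Y), w) in enumerate(zip(points, weights)):
        x, y = X - xj, Y - yj
        if i == j or x == 0:
            continue
        if x > 0:                 # step 1: x_i > 0 starts in I_U
            c_upper += w * x
        else:                     # step 1: x_i < 0 starts in I_L
            c_lower += w * x
        ranked.append((y / x, i, x, w))
    for ratio, k, x, w in sorted(ranked):        # step 2: ascending ratios
        if x > 0:
            c_upper -= w * x                     # delete k from I_U
            if -w * x <= c_lower - c_upper <= w * x:   # condition (52)
                return ratio, k
            c_lower += w * x                     # transfer k to I_L
        else:
            c_lower -= w * x                     # delete k from I_L
            if w * x <= c_lower - c_upper <= -w * x:   # condition (53)
                return ratio, k
            c_upper += w * x                     # transfer k to I_U
    raise ValueError("no other point with a different X value")


points = [(1, 1), (2, 5), (3, 10), (4, 22), (5, 18)]
weights = [1, 2, 6, 3, 2]
# Iteration 1 of Example 1: the origin is added as a point of weight 0.
print(karst_restricted(points + [(0, 0)], weights + [0], 5))
# Iteration 2 of Example 1: lines through the third data point.
print(karst_restricted(points, weights, 2))   # (4.5, 0)
```

The two calls select the third data point (index 2) and then the first data point (index 0), matching k = 3 and k = 1 in the two iterations of Example 1.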

The solution to the restricted problem may be a solution to the unrestricted problem.
From equation (25)

$$v_j = \frac{X_k (A_L - A_U) - (C_L - C_U)}{X_k - X_j} . \qquad (54)$$

If the value of $v_j$ satisfies $-w_j \le v_j \le w_j$, an optimal solution to the overall problem
has been found.

Karst approaches the problem from a point of view not directly based upon a linear
programming formulation. He noted that

$$G(b) = S(Y_j - bX_j,\, b) = \sum_{i=1}^{N} w_i \, |y_i - b x_i|$$

is piecewise linear and convex as a function of b. Furthermore the individual term
$w_i |y_i - b x_i|$ has slope $-w_i |x_i|$ for $b < y_i / x_i$ and slope $w_i |x_i|$ for $b > y_i / x_i$. Hence
G(b) has slope $-\sum_{i=1}^{N} w_i |x_i|$ for $b < \min_i \{y_i / x_i\}$ and slope $\sum_{i=1}^{N} w_i |x_i|$ for
$b > \max_i \{y_i / x_i\}$, and the slope of G(b) increases by $2 w_i |x_i|$ as b increases from
$y_i / x_i - \epsilon$ to $y_i / x_i + \epsilon$, where $\epsilon > 0$. The index k is sought for which the slope of G(b)
changes from negative to positive. Thus by ranking the values of $y_i / x_i$ in ascending order
and successively adding $2 w_i |x_i|$ to an initial value of $-\sum_{i=1}^{N} w_i |x_i|$ the index k may be
found which changes the total to a positive quantity. As before the solution is then
$b = y_k / x_k$. Karst proceeds
with successive applications of the procedure until the index k so obtained is one previously
encountered. He does not take advantage of the stopping rule of the simplex or dual simplex
algorithm, which is what equation (54) and the following inequality accomplish. It is
suggested that the recognition of that stopping rule represents an improvement to the Karst
algorithm.
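Karst's slope-accumulation step combined with the stopping rule can be sketched as below. This is an illustration under stated assumptions rather than Karst's published procedure: the names `restricted_step` and `karst_fit` are hypothetical, indices are zero-based, the dual value of the pivot point is obtained by solving constraints (16) and (17) directly, and degenerate data (further points lying exactly on the current line) are not given special treatment. A starting point that is not a data point is appended with weight 0.

```python
def restricted_step(points, weights, j):
    """Minimize G(b) over lines through points[j]; return (b, k)."""
    xj, yj = points[j]
    cand = []  # (y_i / x_i, i, w_i * |x_i|) for points with x_i != 0
    for i, ((X, Y), w) in enumerate(zip(points, weights)):
        x, y = X - xj, Y - yj
        if i != j and x != 0:
            cand.append((y / x, i, w * abs(x)))
    slope = -sum(c[2] for c in cand)      # slope of G(b) for very small b
    for ratio, k, wx in sorted(cand):
        slope += 2 * wx                   # slope just to the right of y_k / x_k
        if slope >= 0:
            return ratio, k
    raise ValueError("no other point with a different X value")


def karst_fit(points, weights, j, max_iter=100):
    """Repeat the restricted step, stopping when -w_j <= v_j <= w_j."""
    for _ in range(max_iter):
        b, k = restricted_step(points, weights, j)
        xj, yj = points[j]
        xk = points[k][0]
        a = yj - b * xj
        sum_v = sum_xv = 0.0
        for i, ((X, Y), w) in enumerate(zip(points, weights)):
            if i not in (j, k):
                v = w if Y > a + b * X else -w   # above: w_i, else -w_i
                sum_v += v
                sum_xv += X * v
        # v_j from v_j + v_k = -sum_v and X_j v_j + X_k v_k = -sum_xv.
        v_j = (xk * sum_v - sum_xv) / (xj - xk)
        if abs(v_j) <= weights[j]:
            return a, b
        j = k                              # continue along the new basic line
    raise RuntimeError("no convergence within max_iter iterations")


# Example 1, starting from the origin (appended with weight 0).
pts1 = [(1, 1), (2, 5), (3, 10), (4, 22), (5, 18)]
wts1 = [1, 2, 6, 3, 2]
print(karst_fit(pts1 + [(0, 0)], wts1 + [0], 5))     # (-3.5, 4.5)

# Example 2, starting from the origin.
ys2 = [34, 24, 31, 40, 30, 49, 48, 48, 67, 58, 67]
pts2 = [(i + 1, y) for i, y in enumerate(ys2)]
wts2 = [3, 5, 8, 10, 15, 20, 23, 20, 15, 13, 11]
print(karst_fit(pts2 + [(0, 0)], wts2 + [0], 11))    # (18.25, 4.25)
```

Both runs stop after two restricted steps, because the dual value of the pivot point then lies within its bounds.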

An alternative linear programming formulation of the restricted problem can be developed.
If $a = Y_j - bX_j$ is substituted in equation (13), the primal problem that results is

$$\text{minimize} \quad \sum_{i=1}^{N} w_i (P_i + N_i)$$

subject to

$$P_i - N_i + b x_i = y_i$$

$$P_i \ge 0, \quad N_i \ge 0, \quad i = 1, 2, \ldots, N .$$

The dual problem is

$$\text{maximize} \quad \sum_{i=1}^{N} y_i v_i$$

subject to

$$\sum_{i=1}^{N} x_i v_i = 0$$

$$-w_i \le v_i \le w_i ,$$

or, with $t_i = v_i + w_i$,

$$\text{maximize} \quad \sum_{i=1}^{N} y_i t_i - \sum_{i=1}^{N} y_i w_i$$

subject to

$$\sum_{i=1}^{N} x_i t_i = \sum_{i=1}^{N} x_i w_i , \qquad 0 \le t_i \le 2w_i, \quad i = 1, 2, \ldots, N .$$

The latter problem appears to be a knapsack problem for which a median type solution
is available. However, when some of the $x_i$ values are positive and some negative,
the simple median procedure does not apply. Thus, the alternative linear programming
formulation does not appear to be any more useful than the previous formulation.

Barrodale and Roberts [3] have proposed an algorithm for the general discrete $\ell_1$
linear approximation problem. Their algorithm is based on the primal formulation
(minimization) and takes advantage of the special structure of that formulation. They observe that
many columns in the full simplex tableau need not be explicitly retained and may be inferred
from the critical columns that are retained. If the values of n parameters are to be
determined (two in the case of fitting a line), then only n columns need to be maintained
from one iteration to another. Furthermore they have devised a way of combining several
simplex iterations into one iteration of their algorithm. It will be seen in the examples which
follow that the Karst algorithm, when (0,0) is the initial point in the (X, Y) plane, is
identical to the Barrodale and Roberts procedure in terms of the path in the (a,b) plane followed
to optimum.


6.  Example 


Example 1: A five point problem with values given in
the table below.

     i    X_i    Y_i    w_i
     1     1      1      1
     2     2      5      2
     3     3     10      6
     4     4     22      3
     5     5     18      2
                      W = 14


Sharpe's Algorithm

First minimization along b = 0.

     i    Y'_i    Rank    (Y_i - Y_3)/(X_i - X_3)
     1      1       5          4.5
     2      5       4          5.0
     3     10       3           -
     4     22       1         12.0
     5     18       2          4.0

        W    =  14
     -2w_4   =  -6
     -2w_5   =  -4
     -2w_3   = -12
               ----
                -8

j = 3, a = 10 and a = 10 - 3b is the border line for an interval of values b near b = 0.
$I_U = \{4, 5\}$, $I_L = \{1, 2\}$, $A_U = 5$, $A_L = 3$, $C_U = 22$, $C_L = 5$, $D_U = 102$ and $D_L = 11$.

Hence $S_2(b) = D_U - D_L - Y_j (A_U - A_L) - b\,[\,C_U - C_L - X_j (A_U - A_L)\,] = 71 - 11b$.

Since $S_2'(b) = -11$, b must be increased. The value of b for which the border line first
intersects with another basic line is b = 4 at the intersection with a = 18 - 5b. At this
point j = 3, k = 5, $I_U = \{4\}$, $I_L = \{1, 2\}$, $A_U = 3$, $A_L = 3$, $C_U = 12$, $C_L = 5$, $D_U = 66$
and $D_L = 11$. With $w_j = 6$ and $w_k = 2$, the condition which holds is
$w_k - w_j \le A_U - A_L \le w_k + w_j$. Hence, a = 10 - 3b continues as the border for b > 4.
Also, for b > 4, index $5 \in I_L$ and $S_2(b) = 39 - 3b$. Thus b = 4 and a = -2 is not
optimal.

The next critical value of b is b = 4.5 which occurs for k = 1 at the intersection
with a = 1 - b. $I_U = \{4\}$, $I_L = \{2, 5\}$, $A_U = 3$, $A_L = 4$, $C_U = 12$, $C_L = 14$, $D_U = 66$ and
$D_L = 46$. Since $w_k - w_j \le A_U - A_L \le w_k + w_j$ holds, a = 10 - 3b continues as the border
for b > 4.5. However, for b > 4.5, index $1 \in I_U$ and

$$S_2(b) = 21 + b ,$$

which means the optimal solution has been found: b = 4.5, a = -3.5 and S(a,b) = 25.5.
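The values of the minimum sum at the critical slopes of this example can be checked numerically. The sketch below is an independent check rather than Sharpe's procedure: it evaluates the minimum over a by trying every candidate intercept $Y_i - bX_i$, which suffices because the minimizing a is always one of them. The function name `s2` is chosen here for illustration.

```python
def s2(points, weights, b):
    """Minimum over a of S(a, b), attained at some a = Y_i - b * X_i."""
    candidates = [y - b * x for x, y in points]
    return min(
        sum(w * abs(y - b * x - a) for (x, y), w in zip(points, weights))
        for a in candidates
    )


points = [(1, 1), (2, 5), (3, 10), (4, 22), (5, 18)]
weights = [1, 2, 6, 3, 2]
for b in (0, 4, 4.5, 5):
    print(b, s2(points, weights, b))
```

The printed values 71, 27, 25.5 and 26 agree with the pieces 71 - 11b, 39 - 3b and 21 + b, and show the minimum sum falling up to b = 4.5 and rising beyond it.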

Karst's Algorithm

Let the initial fixed point in the (X, Y) plane be the origin.

Iteration 1: Minimize along a = 0.

     i    x_i    y_i    w_i    y_i/x_i    w_i x_i    c_U    c_L    Condition (52) or (53) satisfied?
     1     1      1      1      1.0          1        44      0     no
     2     2      5      2      2.5          4        40      1     no
     3     3     10      6      3.3         18        22      5     yes
     4     4     22      3      5.5         12
     5     5     18      2      3.6         10

Initially $c_U = 45$ and $c_L = 0$; the values of $c_U$ and $c_L$ listed are those at the time
the condition is tested.

∴ k = 3

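Iteration 2: Along a = 10 - 3b with j = 3.

     i    x_i    y_i    y_i/x_i    w_i    w_i x_i    c_U    c_L    Conditions (52) or (53) satisfied?
     1    -2     -9      4.5        1       -2         3      2     yes
     2    -1     -5      5.0        2       -2
     3     0      0       -         6        0
     4     1     12     12.0        3        3
     5     2      8      4.0        2        4         3     -4     no

Initially $c_U = 7$ and $c_L = -4$; index 5 (ratio 4.0) is tested first, then index 1 (ratio 4.5).

∴ k = 1

$$v_1 = \frac{c_L - c_U}{x_1} = \frac{2 - 3}{-2} = \frac{1}{2} , \qquad v_3 = (A_L - A_U) - v_1 = 1 - \frac{1}{2} = \frac{1}{2} .$$

Since $-w_3 \le v_3 \le w_3$, the solution is optimal with b = 4.5
and a = -3.5 as before.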

Barrodale and Roberts Improved Algorithm

The full primal linear programming version of this problem requires fourteen variables
and five constraints after the variables unrestricted in sign, a and b, are each replaced
by the difference of two non-negative variables. However, as Barrodale and Roberts have
shown, the normal simplex tableaux contain many columns which need not be maintained.
Only two columns corresponding to non-basic variables plus the column of current solution
values are needed. Again starting with (a, b) = (0,0) the sequence of condensed tableaux
is shown below.

Initial Tableau

     Basis            Costs     a      b      R
     P_1                1       1      1*     1
     P_2                2       1      2      5
     P_3                6       1      3     10
     P_4                3       1      4     22
     P_5                2       1      5     18
     Marginal Costs            14     45    173
The normal simplex pivot (1*) corresponds to b = 1, but b can be increased
beyond that value by making $P_1$ non-basic and $N_1$ basic. That is accomplished by
subtracting $2w_1$ from the marginal cost row. In a similar manner $P_2$ will be replaced
by $N_2$ as b increases beyond 2.5. Finally b will become basic replacing $P_3$,
bringing the first iteration to an end. The marginal cost row went through the following
sequence of values as $N_1$ and $N_2$ became basic.

                            a      b      R
     Initial tableau       14     45    173
     N_1 replaces P_1      12     43    171
     N_2 replaces P_2       8     35    151

Notice that if the value of b goes beyond 10/3 so that $N_3$ replaces $P_3$, the marginal
cost for b becomes negative, which means that b should not increase beyond 10/3 at
this iteration.

Second  Tableau 


The second tableau does not represent an optimal solution because the marginal cost
of -a is positive, indicating that decreasing a below 0 will decrease total cost. The
normal simplex pivot (2/3 *) corresponds to -a = 2, but a can be decreased further by
making $N_5$ basic replacing $P_5$. The marginal cost row then becomes

                           -a     P_3     R
     Marginal costs         1     -5     29

The pivot then employed (2/3 **) brings the tableau to the following:

Third Tableau

     Basis            Costs     P_1      P_3       R
     -a                 0      -3/2      1/2      3.5
     N_2                2       1/2      1/2      0.5
     b                  0      -1/2      1/2      4.5
     P_4                3       1/2     -3/2      7.5
     N_5                2      -1        2        1
     Marginal Costs            -1/2    -11/2     25.5

The marginal costs of $P_1$ and $P_3$ are negative, and the marginal costs of $N_1$ and
$N_3$ can both be inferred to be negative from the relation: sum of marginal costs of $P_i$ and
$N_i$ equals $-2w_i$. Therefore, the solution represented by the third tableau is optimal with
a = -3.5, b = 4.5 and S(a, b) = 25.5.

Comparison of the Barrodale and Roberts condensed tableau with the Karst algorithm
reveals that in this example they are identical in terms of the sequence of values of a and b
through which they progressed to the optimal solution.
Example 2: An eleven point problem with values as shown.

     i     X_i    Y_i    w_i
     1      1     34      3
     2      2     24      5
     3      3     31      8
     4      4     40     10
     5      5     30     15
     6      6     49     20
     7      7     48     23
     8      8     48     20
     9      9     67     15
    10     10     58     13
    11     11     67     11
                      W = 143

b = 0

                                                   (Y_i - Y_7)   (Y_i - Y_4)   (Y_i - Y_10)
     i    X_i    Y_i    w_i    Y'_i    Rank   U/L  -----------   -----------   ------------
                                                   (X_i - X_7)   (X_i - X_4)   (X_i - X_10)
     1     1     34      3      34      8      L       2.33          2            2.67
     2     2     24      5      24     11      L       4.80          8            4.25
     3     3     31      8      31      9      L       4.25          9            3.86
     4     4     40     10      40      7      L       2.67          -            3
     5     5     30     15      30     10      L       9.00        -10            5.6
     6     6     49     20      49      4      U      -1.00          4.5          2.25
     7     7     48     23      48     5-6     -        -            2.67         3.33
     8     8     48     20      48     5-6     -       0             2            5
     9     9     67     15      67     1-2     U       9.50          5.4         -9
    10    10     58     13      58      3      U       3.33          3            -
    11    11     67     11      67     1-2     U       4.75          3.86         9
                        143
     j    w_j    k    w_k      a       b'     A_U    A_L    C_U    C_L       S       Slope
     7    23     8    20     48        0      59     41     506    152    1292       -208
          b should increase and for b > 0, 7 is the border index.
     7    23     1     3     31.67     2.33   59     58     506    309     806.67    -172
          b should increase and for b > 2.33, 7 is the border index.
     7    23     4    10     29.33     2.67   62     48     509    269     749.33    -115
          b should increase and for b > 2.67, 4 is the border index.
     4    10    10    13     28        3.0    49     71     379    430     711       -109
          b should increase and for b > 3.0, 10 is the border index.
    10    13     7    23     24.67     3.33   59     48     419    269     674.67     -34
          b should increase and for b > 3.33, 7 is the border index.
     7    23     3     8     18.25     4.25   59     53     419    375     643.5      154
          for b > 4.25, 3 is the border index, but b should not increase.

Thus, the optimal solution is a = 18.25, b = 4.25 and S(a,b) = 643.50.
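The reported minimum can be confirmed by direct substitution. With a = 18.25 and b = 4.25 the fitted values $a + bX_i$ for $X_i = 1, \ldots, 11$ are 22.5, 26.75, 31, 35.25, 39.5, 43.75, 48, 52.25, 56.5, 60.75 and 65, so that the weighted absolute deviations add up as follows:

```latex
\begin{align*}
S(18.25,\, 4.25) &= 3(11.5) + 5(2.75) + 8(0) + 10(4.75) + 15(9.5) + 20(5.25) \\
                 &\quad + 23(0) + 20(4.25) + 15(10.5) + 13(2.75) + 11(2) \\
                 &= 34.5 + 13.75 + 0 + 47.5 + 142.5 + 105 + 0 + 85 + 157.5 + 35.75 + 22 \\
                 &= 643.5 .
\end{align*}
```

The two zero terms correspond to points 3 and 7, the two data points interpolated by the optimal line.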


Iteration 1: Minimize along a = 0.

     i     w_i    x_i    y_i    y_i/x_i
     1      3      1     34     34
     2      5      2     24     12
     3      8      3     31     10.33
     4     10      4     40     10
     5     15      5     30      6
     6     20      6     49      8.17
     7     23      7     48      6.86
     8     20      8     48      6
     9     15      9     67      7.44
    10     13     10     58      5.8
    11     11     11     67      6.09
    12      0      0      0      -

     -Σ w_i |x_i|      = -979
     2 w_10 |x_10|     =  260     total -719
     2 w_8 |x_8|       =  320     total -399
     2 w_5 |x_5|       =  150     total -249
     2 w_11 |x_11|     =  242     total   -7
     2 w_7 |x_7|       =  322     total +315

∴ k = 7

Iteration 2: Along a = 48 - 7b, j = 7.

     i     x_i    y_i    y_i/x_i    U/L
     1     -6     -14     2.33       U
     2     -5     -24     4.80       L
     3     -4     -17     4.25       -
     4     -3      -8     2.67       U
     5     -2     -18     9.00       L
     6     -1       1    -1.00       U
     7      0       0      -         -
     8      1       0     0          L
     9      2      19     9.50       U
    10      3      10     3.33       L
    11      4      19     4.75       U

     -Σ w_i |x_i|      = -288
     2 w_6 |x_6|       =   40     total -248
     2 w_8 |x_8|       =   40     total -208
     2 w_1 |x_1|       =   36     total -172
     2 w_4 |x_4|       =   60     total -112
     2 w_10 |x_10|     =   78     total  -34
     2 w_3 |x_3|       =   64     total  +30

∴ k = 3

Optimal solution: b = 4.25, a = 48 - 7(4.25) = 18.25.
The Barrodale and Roberts method will again follow the same path to the optimal
solution as the Karst algorithm, and will not be shown here.


7.  Conclusions 


The Karst algorithm is similar to the Barrodale and Roberts method in that a single
iteration of each is equivalent to several standard simplex iterations. The two will follow
the same path in the (a,b) plane to optimum starting from (0,0) if initially a unit change
in b produces a greater decrease in S(a,b) than a unit change in a.

For the single purpose of finding the optimal values of a and b the Karst
algorithm is very efficient and undoubtedly involves less computation than Sharpe's algorithm.

The Sharpe algorithm generates the optimal values of a for values of b near the
optimal value of b, and from them the values of $S_2(b)$. If the function $S_2(b)$ is of
interest, then the Sharpe algorithm may be preferred. Otherwise, it is more cumbersome
than Karst's algorithm.